The prevalence and incidence of type 
2 diabetes, representing >90% of all 
cases of diabetes, are increasing rap- 
idly throughout the world. The Interna- 
tional Diabetes Federation has estimated 
that the number of people with diabetes is 
expected to rise from 366 million in 2011 
to 552 million by 2030 if no urgent action 
is taken. Furthermore, as many as 183 
million people are unaware that they 
have diabetes (www.idf.org). Therefore, 
the identification of individuals at high 
risk of developing diabetes is of great im- 
portance and interest for investigators 
and health care providers. 

Type 2 diabetes is a complex disorder 
resulting from an interaction between 
genes and environment. Several risk factors 
for type 2 diabetes have been identified, 
including age, sex, obesity and central 
obesity, low physical activity, smoking, 
diet including low amount of fiber and 
high amount of saturated fat, ethnicity, 
family history, history of gestational di- 
abetes mellitus, history of the nondiabetic 
elevation of fasting or 2-h glucose, elevated 
blood pressure, dyslipidemia, and different 
drug treatments (diuretics, nonselective 
β-blockers, etc.) (1-3). 

There is also ample evidence that type 
2 diabetes has a strong genetic basis. The 
concordance of type 2 diabetes in monozygotic 
twins is ~70% compared with 20-30% 
in dizygotic twins (4). The lifetime 
risk of developing the disease is ~40% in 
offspring of one parent with type 2 diabetes, 
greater if the mother is affected (5), 
and approaching 70% if both parents 
have diabetes. In prospective studies, we 
have demonstrated that first-degree fam- 
ily history is associated with twofold in- 
creased risk of future type 2 diabetes 
(1,6). The challenge has been to find ge- 
netic markers that explain the excess risk 
associated with family history of diabetes. 

Advances in genotyping technology 
during the last 5 years have facilitated 
rapid progress in large-scale genetic stud- 
ies. Since 2007, genome-wide association 
studies (GWAS) have identified >65 ge- 
netic variants that increase the risk of type 
2 diabetes by 10-30% (7,8). Most of these 
variants are noncoding variants, and 
therefore their functional consequences 
are challenging to investigate. Many of 
the variants identified to date regulate in- 
sulin secretion and not insulin action in 
insulin-sensitive tissues. 

In a review by Noble et al. (3), a total 
of 43 different studies were presented 
where nongenetic prediction models for 
type 2 diabetes, including known risk fac- 
tors for type 2 diabetes with different 
combinations, had been analyzed. Het- 
erogeneity of data and highly variable 
methodology of primary studies pre- 
cluded meta-analysis. Altogether, 84 dif- 
ferent risk prediction models were 
presented in 43 studies. C statistics varied 
from 0.60 to 0.91 (from 0.60 to 0.69 in 5 
models, from 0.70 to 0.79 in 44 models, 
from 0.80 to 0.89 in 32 models, and 
≥0.90 in 3 models). These results indicate 
that clinical, laboratory, and other 
information easily collected by interview 
constitutes in most cases a solid basis for 
nongenetic prediction models in type 2 
diabetes. 

Identification of a large number of 
novel genetic variants increasing susceptibility 
to type 2 diabetes and related traits 
has opened up a previously unavailable 
opportunity to translate this genetic 
information into clinical practice and possibly 
improve risk prediction. However, available data 
to date do not yet provide convincing evi- 
dence to support use of genetic screening 
for the prediction of type 2 diabetes. 

In this review, we summarize the 
current evidence on the role of genetic 
variants to predict type 2 diabetes above 
and beyond nongenetic factors and dis- 
cuss the limitations and future potential 
of genetic studies. 

Genetic prediction models for type 2 
diabetes: evidence from cross-sectional 
and longitudinal studies 

Several studies have indicated that differ- 
ent genetic variants (single nucleotide 
polymorphisms [SNPs]) are associated 
with type 2 diabetes. Genetic risk models 
for type 2 diabetes, based on both cross- 
sectional (9-17) and longitudinal (1,17-24) 
studies, are summarized in Table 1. 
Cross-sectional studies. In cross- 
sectional studies including 3,000-9,000 
individuals with and without type 2 
diabetes, the discriminatory ability of 
the combined SNP information has been 
assessed by grouping individuals based 
on the number of risk alleles and deter- 
mining relative odds of type 2 diabetes, as 
well as by calculating the area under the 
receiver operating characteristic curve 
(AUC). As shown in Table 1, the AUC of 
the genetic risk score (GRS), which com- 
bines the information from all risk var- 
iants included in the study, has ranged 
from 0.54 to 0.63, indicating that genetic 
factors have limited use in predicting an 
individual's risk of the disease. In contrast, 
the AUC has been considerably 
larger (from 0.61 to 0.95) for clinical 
models including different combinations 
of clinical and laboratory parameters (age, 
sex, and BMI in all models and family history 
of diabetes and fasting glucose in 
most of the models) predicting the risk 
of type 2 diabetes. Adding the GRS to 
the same model shows that risk variants 
increase the predictive value at the 
population level only minimally beyond 
clinical and laboratory parameters, 
although the model improvement could 
be statistically significant (P < 0.05) in 
some cases. 

Table 1. Comparison of clinical and genetic prediction models for type 2 diabetes 

| Studies | N | N SNPs | AUC GRS | AUC clinical model | AUC GRS plus clinical model | Significant improvement |
|---|---|---|---|---|---|---|
| Prevalent type 2 diabetes | | | | | | |
| Lango et al. (11) | 4,907 | 18 | 0.60 | 0.78 | 0.80 | Yes |
| Lin et al. (12) | 5,360 | 15 | 0.59 | 0.86 | 0.87 | Yes |
| | | | | | | |
| Wang et al. (16) | 7,232 | 19 | 0.55 | 0.727 | 0.730 | No |
| Qi et al. (14) | 3,210 | 17 | 0.62 | 0.77 | 0.79 | |
| Miyake et al. (13) | 4,686 | 11 | 0.63 | 0.68 | 0.72 | NA |
| Janipalli et al. (10) | 3,357 | 32 | 0.634 | 0.959 | 0.963 | Yes |
| Xu et al. (17) | 4,025 | 4 | NA | 0.714 | 0.73 | NA |
| Incident type 2 diabetes | | | | | | |
| Lyssenko et al. (1) | 16,061 | 11 | 0.63 | 0.74 | 0.75 | Yes |
| Meigs et al. (20) | 2,377 | | 0.581 | 0.900 | 0.901 | No |
| | | | | | | |
| <50 years | | | | 0.908 | 0.911 | No |
| >50 years | | | | 0.883 | 0.884 | No |
| Balkau et al. (18) | 3,817 | 2 | NA | 0.850 (M), 0.917 (W) | 0.851 (M), 0.912 (W) | No |
| Schulze et al. (21) | 2,500 | 20 | NA | 0.8626 | 0.8628 | No |
| Talmud et al. (22) | 5,535 | 20 | 0.54 | 0.78 | 0.78 | No |
| Vaxillaire et al. (24) | 3,442 | 3 | 0.56 | 0.82 | 0.83 | NA |
| | | | | 0.634 | 0.663 | |
| Nested case-control | | | | | | |

M, men; NA, information not available; W, women. 
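
As a minimal sketch of the two quantities discussed above, the following Python example computes an unweighted GRS as a simple count of risk alleles and estimates the AUC as the probability that a randomly chosen case has a higher score than a randomly chosen control. The genotypes are hypothetical and invented for illustration only, and the function names are our own; published studies may weight alleles by effect size, which is not shown here.

```python
def genetic_risk_score(genotype):
    """Unweighted GRS: total number of risk alleles carried (0, 1, or 2 per SNP)."""
    if any(count not in (0, 1, 2) for count in genotype):
        raise ValueError("each SNP must carry 0, 1, or 2 risk alleles")
    return sum(genotype)


def auc(case_scores, control_scores):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical risk-allele counts at four SNPs (not data from any cited study).
cases = [[2, 1, 1, 0], [1, 1, 0, 1], [2, 1, 1, 1], [0, 1, 1, 0]]
controls = [[1, 1, 1, 0], [2, 1, 0, 1], [0, 1, 0, 1], [1, 0, 1, 1]]

case_grs = [genetic_risk_score(g) for g in cases]        # [4, 3, 5, 2]
control_grs = [genetic_risk_score(g) for g in controls]  # [3, 4, 2, 3]

grs_auc = auc(case_grs, control_grs)
print(grs_auc)  # 0.625
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination, so the toy value of 0.625 sits in the same modest range as the GRS values reported in Table 1.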

Perhaps the most important clinical 
question in cross-sectional studies is try- 
ing to identify undiagnosed individuals 
with type 2 diabetes. We addressed this 
question in our large population-based 
Metabolic Syndrome in Men (METSIM) 
Study (16). We identified undiagnosed 
type 2 diabetic patients using the Finnish 
Diabetes Risk Score alone (25), which was 
the best single indicator of prevalent un- 
diagnosed diabetes among all variables 
tested in our study. The AUC based on 
logistic regression models for the identi- 
fication of previously undiagnosed type 
2 diabetic subjects with the Finnish 
Diabetes Risk Score alone was 0.727, and 
it was 0.772 after adding total triglycerides, 
HDL cholesterol, adiponectin, and alanine 
transaminase in the model. Adding type 2 
diabetes risk alleles (20 SNPs) did not fur- 
ther improve the model (0.772) (16). 
Therefore, in our study common genetic 
variants did not seem to add any informa- 
tion on the identification of people having 
undiagnosed diabetes. 
Longitudinal studies. Longitudinal 
studies can address the question of what 
the nongenetic and genetic risk factors 
predicting incident type 2 diabetes are. 
Several large population-based follow-up 
studies have been published aiming to 
investigate the predictive power of com- 
mon genetic variants on the risk of in- 
cident type 2 diabetes (Table 1). These 
studies, including genetic information 
from 2 to 40 SNPs, reported results sur- 
prisingly similar to those from cross- 
sectional case-control studies. Estimates of 
C statistics have ranged from 0.54 to 0.63. 
Different clinical prediction models yielded 
higher C statistics, ranging from 0.63 to 
0.917, which are also quite similar to those 
based on cross-sectional studies. Risk 
variants did not materially increase the AUC 
for predicting type 2 diabetes when 
combined with clinical risk factors. 
In one study, type 2 diabetes risk 
prediction of a combined clinical and ge- 
netic model was somewhat better in youn- 
ger (<50 years) than in older (>50 years) 
individuals (19) and in women than in 
men (18). Most of these prospective stud- 
ies were performed in Caucasian popula- 
tions, with only one in Chinese (17). 

Are genetic prediction models for 
type 2 diabetes worthless? 

Both cross-sectional and longitudinal 
studies published thus far (Table 1) dem- 
onstrate that genetic screening for the pre- 
diction of type 2 diabetes in high-risk 
individuals is currently of little value in 
clinical practice. Table 2 lists several 
limitations of the GRSs published to date. 

Table 2. Limitations and potential of GRS studies 

Limitations 
  Small effect size of genetic loci 
  Low discriminative ability of the GRS 
  Small added value of GRS compared with clinical risk factors 
  Questionable clinical relevance of some genetic variants in disease prediction 
  Lack of appropriate models for studies of gene-gene and gene-environment interactions in risk prediction 
Potential in the future 
  Genetic studies will help to subtype individuals with diabetes 
  New sequencing techniques will identify low-frequency and rare variants with large effect sizes 
  Studies in non-European ancestry populations will help to identify new variants relevant to type 2 diabetes prediction 
  Studies of structural variation and epigenetics may help to identify new variants relevant to type 2 diabetes prediction 
  Large population-based studies and development of statistical methods will improve analyses of gene-gene and gene-environment interactions 

Small effect size of genetic loci. Effect 
sizes of common genetic variants for type 
2 diabetes identified to date are rather 
modest, ranging from 10 to 35% (7,8). An 
attempt to compose a GRS combining 
several genetic variants has shown 
only a 10-12% increased risk of disease 
with increasing number of risk alleles. 
In the Malmö Preventive Project study (1), 
the risk was approximately twofold higher 
when carriers of the highest number of risk 
alleles were compared with carriers of the 
lowest number (top 20%, ≥12 risk alleles, 
vs. bottom 20%, ≤8 risk alleles). Increasing 
the number of genetic variants up to 40 did 
not seem to substantially improve the risk 
prediction (19). The observed modest effect 
sizes could be partially attributed to the 
fact that low-frequency or rare variants 
have not yet been reported. Also, it is 
worth mentioning that the majority of 
the loci identified by GWAS are not, in 
fact, genes. The type 2 diabetes-associated 
loci represent an associated SNP, and 
there are still no data on whether the 
top associated signal represents the 
"causal gene," much less the "causal 
variant." 
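
To illustrate why modest per-allele effects still translate into only about a twofold difference between the extremes of the GRS distribution, the following sketch compounds a per-allele odds ratio over a difference in risk-allele count. It assumes, as a simplification that is ours and not stated in the cited studies, that the 10-12% figure is a per-allele odds ratio and that alleles act multiplicatively (log-additively); the function name is hypothetical.

```python
def combined_odds_ratio(per_allele_or, extra_risk_alleles):
    """Odds ratio between two people differing by `extra_risk_alleles` risk alleles,
    assuming every allele multiplies the odds by the same factor."""
    return per_allele_or ** extra_risk_alleles


# Differences of 4 and 6 alleles roughly span the gap between the
# >=12 and <=8 risk-allele groups mentioned in the text.
for per_allele_or in (1.10, 1.12):
    for extra in (4, 6):
        ratio = combined_odds_ratio(per_allele_or, extra)
        print(per_allele_or, extra, round(ratio, 2))
# 1.1 4 1.46
# 1.1 6 1.77
# 1.12 4 1.57
# 1.12 6 1.97
```

Under these assumptions, a difference of four to six risk alleles corresponds to an odds ratio of roughly 1.5 to 2.0, which is of the same order as the approximately twofold difference reported between the top and bottom groups.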

Low discriminative ability of the GRS. 

A good diagnostic test in clinical practice 
has high sensitivity and specificity. Consistently 
across all studies, the C statistics 
(AUCs) for genetic models are typically 
~0.60, suggesting that a genetic test 
performs only a little better than flipping a 
coin. These results demonstrate that the 
performance of a genetic test remains 
rather poor even after adding all recently 
identified genetic variants in the model 
(19). This may not be very surprising, 
since these variants explain only ~10-15% 
of the heritability of type 2 diabetes 
(7). The application of the genome-wide 
effects taking into account all SNPs and 
not just those that reach a Bonferroni 
level of significance has recently been 
used in studies on height (26). The results 
suggested that this approach can reduce 
the amount of missing heritability and 
may permit a better GRS. However, the 
clinical utility of this application for the 
prediction of type 2 diabetes needs to be 
tested and validated. Finally, type 2 dia- 
betes represents a heterogeneous condi- 
tion defined by hyperglycemia, and there 
may be several subtypes of diabetes yet to 
be defined. Genetic variants operating 
through different pathways in the disease 
pathogenesis, such as obesity, and con- 
tributing to variation in glycemic traits 
together may have greater predictive 
value for diabetes and its different sub- 
types. Therefore, the analyses evaluating 
prediction models based on all reported 
variants associated with type 2 diabetes 
(8) and glycemic traits including glucose 
and insulin levels during an oral glucose 
tolerance test (OGTT) (27) but also obe- 
sity (28) should be performed. 
Small added value of GRS compared 
with clinical risk factors. Another question 
that arises about the usefulness of 
genetic screening in clinical practice is 
whether genetic information improves 
the discriminative accuracy of a test using 
traditional routine clinical risk factors 
alone. Prospective and cross-sectional 
studies have reported somewhat different 
discriminatory values depending on study 
ascertainment (inclusion and exclusion criteria 
for different metabolic risk factors), the 
length of the follow-up period in the 
prospective cohorts, obesity, and the 
presence of family history of diabetes. 
A consistent finding in all of these studies 
has been that the GRS adds very little to the 
information provided by clinical risk factors 
alone. Thus, in the largest prospective 
studies, adding genotyped genetic variants 
to the clinical model only slightly improved 
the AUC: from 0.74 to 0.75 in 
the Swedish Malmö Preventive Project 
study (1), from 0.900 to 0.901 in the 
Framingham Offspring study (20), and from 
0.66 to 0.68 in the Rotterdam study (23). 
One explanation for these findings could 
be that clinical risk factors themselves, 
such as obesity and elevated glucose lev- 
els, harbor a substantial genetic compo- 
nent, and therefore different GRS models 
underestimate the true significance of ge- 
netic variation as a predictor for type 2 di- 
abetes. 
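
The comparison between a clinical model and a clinical-plus-GRS model can be sketched as follows. This is an illustrative toy example, not the method of any cited study: the clinical scores, GRS values, and the weight given to the GRS are all hypothetical, whereas the published analyses estimated model coefficients by regression in real cohorts.

```python
def c_statistic(case_scores, control_scores):
    """Fraction of case-control pairs in which the case scores higher (ties count 0.5)."""
    concordant = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                concordant += 1.0
            elif case == control:
                concordant += 0.5
    return concordant / (len(case_scores) * len(control_scores))


# Hypothetical clinical risk scores and GRS values for 4 cases and 4 controls.
case_clinical, case_grs = [8, 6, 7, 3], [4, 4, 5, 2]
control_clinical, control_grs = [2, 5, 4, 6], [3, 4, 2, 3]

GRS_WEIGHT = 0.2  # assumed weight of one risk allele relative to one clinical point


def combined_scores(clinical, grs):
    """Linear combination of the clinical score and the GRS."""
    return [c + GRS_WEIGHT * g for c, g in zip(clinical, grs)]


auc_clinical = c_statistic(case_clinical, control_clinical)
auc_combined = c_statistic(
    combined_scores(case_clinical, case_grs),
    combined_scores(control_clinical, control_grs),
)
print(auc_clinical, auc_combined, auc_combined - auc_clinical)
# 0.78125 0.8125 0.03125
```

In this toy data set the GRS changes the ranking of only one case-control pair (a tie on the clinical score), which is why the C statistic moves so little once the clinical score already separates most pairs.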

Questionable clinical relevance of some 
genetic variants in disease prediction. 

Once genetic loci are identified in the 
case-control studies, it is very important 
to validate their ability to predict disease 
in prospective studies. Prospective stud- 
ies represent a more controlled setting 
where both case and control subjects are 
ascertained in the same way and have 
similar environmental exposure and 
therefore give the true incidence of the 
disease in a population. Of the genetic 
loci studied, 11 of 16 in the Malmö 
Preventive Project study (1), 2 of 18 in the 
Framingham Offspring study (20), and 9 of 18 
in the Rotterdam study (23) were associated 
with the risk of developing future 
type 2 diabetes. These results may sug- 
gest that not all genetic variants that 
were significantly associated with type 2 
diabetes in case-control studies are clini- 
cally relevant in the processes responsible 
for the conversion to type 2 diabetes. 
However, we could not rule out a lack 
of power, since similar observations 
have also been made in case-control studies. 
We are currently conducting the largest 
meta-analysis to date of prospective 
cohorts in European consortia (ENGAGE), 
including a total of ~55,000 individuals 
followed for >15 years, to increase sample 
size and, thus, improve statistical power. 
Our preliminary findings support the 
notion that the validation and characteriza- 
tion of genetic variants identified in case- 
control studies should be performed before 
any claims of their clinical relevance are 
made. 

Lack of appropriate models for studies 
of gene-gene and gene-environment 
interactions in risk prediction. There is 
very little information on how much 
gene-gene and gene-environment inter- 
actions contribute to the prediction of 
type 2 diabetes. The success in the appli- 
cation of the methodological techniques 
to study epistatic effects in different 
populations has been limited. Given the 
extensive computational capacity and statistical 
power required for running these tests, 
researchers have mainly studied interactions 
between genomic loci that have already been 
found (29). However, studies in plants 
and animals clearly demonstrate that 
epistatic (interaction) effects are often 
detected in the absence of main effects (30). 
Our recent studies demonstrate that the 
risk of disease conferred by genetic var- 
iants might be neutralized by their con- 
comitant beneficial effects in other key 
organs and tissues involved in the patho- 
genesis of type 2 diabetes or having dif- 
ferent responses to nutrition — so-called 
pleiotropic effects (31,32). For example, 
the insulin secretion-reducing effect of a genetic 
variant in GIPR is ameliorated by its 
beneficial effects on body composition, 
including BMI, waist, and fat mass (31). 
Furthermore, the carriers of the GIPR var- 
iant seem to respond differently to food 
rich in carbohydrates and fat (32,33). 
Similar observations have been reported 
for an interaction between a variant in the 
FTO gene and physical activity on the risk 
of obesity and cardiovascular diseases 
(34,35). Among women, carriers of the 
obesity-associated allele in FTO have a higher 
cardiovascular risk only if they are 
physically inactive, not if they 
are physically active, suggesting that the 
risk for developing cardiovascular disease 
can be prevented or delayed in the risk 
allele carriers if they are physically active. 
Thus, defining the nature of the gene- 
gene and gene-environment interactions 
can clearly help to improve prediction 
and identify persons at increased risk of 
type 2 diabetes (36). 

Genetic prediction models for 
type 2 diabetes can be valuable 
in the future 

Previously published genetic studies have 
severe limitations that underestimate the 
true significance of genetic variants in 
predicting type 2 diabetes (Table 2). 
Genetic prediction models can be im- 
proved by increasing the precision of the 
diagnosis of type 2 diabetes, by identifica- 
tion of low-frequency and rare genetic var- 
iants, by identification of risk variants for 
type 2 diabetes in non-European ancestry 
populations, by increasing knowledge on 
structural variation and epigenetics, and 
by developing statistical techniques to eval- 
uate gene-gene and gene-environment 
interactions. 



Necessity of improving the precision 
of the diagnosis of individuals with 
diabetes. Type 2 diabetes is a chronic 
hyperglycemic condition that is not type 
1 diabetes or other subtypes of diabetes, 
which include genetic defects of insulin 
secretion and action, diseases of exocrine 
pancreas, endocrinopathies, drug- or 
chemically induced diabetes, diabetes in 
connection with infections, uncommon 
forms of immunomediated diabetes, 
other genetic syndromes sometimes asso- 
ciated with diabetes, or gestational di- 
abetes mellitus (37). In other words, there 
is no precise definition of type 2 diabetes. 
In fact, this main subtype of diabetes is 
defined by excluding all other conditions 
leading to chronic hyperglycemia. 

Differential diagnosis between differ- 
ent subtypes of diabetes is challenging, 
especially between type 2 diabetes and 
late-onset and slowly developing type 1 
diabetes. Patients having this subtype of 
diabetes, also called latent autoimmune 
diabetes in adults, have a progressive 
insulin secretion defect, share a genetic 
predisposition with both type 1 and type 
2 diabetic patients, and are often diagnosed 
erroneously as type 2 patients (38). 
These patients, positive for GAD antibodies, 
may account for ~10% of all diabetic 
patients (39) and are the most important 
subtype of diabetes leading to misclassifi- 
cation of diabetic patients. Additionally, 
recent exome sequencing studies have 
demonstrated that there is a continuously 
increasing number of monogenic forms of 
diabetes, which implies that the definition 
of type 2 diabetes in previous genetic 
studies may have been imprecise 
(40,41). Thus, it is very likely that every 
study population includes a varying num- 
ber of individuals who have monogenic 
diabetes and who have been misclassified 
as having type 2 diabetes. Finally, it is im- 
portant to note that several large-scale 
case-control or cohort studies have not ap- 
plied an OGTT, which implies that their 
nondiabetic control group includes a vary- 
ing number of individuals having type 2 
diabetes. Imprecise classification of indi- 
viduals with diabetes into subtypes and 
poor diagnostic procedures to find or ex- 
clude individuals with diabetes have con- 
siderably weakened the power of previous 
genetic prediction models. More careful 
phenotyping and classification of partici- 
pants into different subtypes of diabetes 
are needed in future studies aiming to improve 
genetic prediction models. Dynamic 
measures of β-cell function (i.e., 
glucose-stimulated insulin secretion 
during an OGTT) and insulin resistance 
(i.e., during a clamp) among nondiabetic 
individuals will be highly informative for the 
design of future studies. 
New sequencing techniques will identify 
low-frequency and rare variants with 
large effect sizes. Genome-wide association 
studies are based on the "common 
disease, common variant" hypothesis, as- 
suming that common diseases are attrib- 
utable in part to allelic variants present in 
>5% of the population (42). These stud- 
ies have been able to identify only rela- 
tively common variants that essentially 
contributed to the generation of different 
genetic risk models for complex diseases, 
including type 2 diabetes. Therefore, 
new technologies (exome sequencing, 
custom-made exome chips) are needed 
to identify low frequency (<5%) or rare 
(<0.5%) variants having larger effect 
sizes that could potentially explain a 
part of the "missing heritability" (43). 
Importantly, as previously mentioned, 
the GRS that emerges from GWAS may 
not, in fact, be using the "true" causal 
variant (or may not even be in the true 
causal gene). As a result, through fine- 
mapping and sequencing, perhaps the 
true genes/variants can be identified 
and, with use of these in a GRS, the pre- 
diction ability might increase. It has been 
estimated that 20 variants with risk allele 
frequency of 1% and allelic odds ratio of 
3.0 could account for most familial ag- 
gregation of type 2 diabetes (43). Results 
from exome sequencing and custom- 
made exome chip studies soon to be pub- 
lished will clarify the role of variants 
with a population frequency <5% in 
chronic diseases, including type 2 diabe- 
tes. This work will be facilitated by the 
comprehensive catalog of variants with 
the minor allele frequency >1% generated 
by the 1000 Genomes Project 
(http://www.1000genomes.org/page.php). 
Identification of low-frequency and 
rare variants makes it possible to search 
for causal variants in gene regions having 
simultaneously common variants associ- 
ated with the disease. 
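
The frequency classes used in this section can be written down as a small helper. This is only a sketch of the thresholds quoted above (common >5%, low frequency <5%, rare <0.5%); how variants falling exactly on a boundary are classified is our assumption, and the function name is hypothetical.

```python
def classify_variant(minor_allele_frequency):
    """Assign a variant to a frequency class using the thresholds quoted in the text."""
    if not 0.0 <= minor_allele_frequency <= 0.5:
        raise ValueError("minor allele frequency must lie between 0 and 0.5")
    if minor_allele_frequency < 0.005:   # <0.5%
        return "rare"
    if minor_allele_frequency < 0.05:    # 0.5% up to, but not including, 5%
        return "low-frequency"
    return "common"                      # 5% and above (boundary handling assumed)


for maf in (0.30, 0.01, 0.001):
    print(maf, classify_variant(maf))
# 0.3 common
# 0.01 low-frequency
# 0.001 rare
```

Under this scheme, a variant with a risk allele frequency of 1%, such as those in the estimate cited above (43), falls in the low-frequency class that GWAS arrays were not designed to capture.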

Studies on monogenic forms of diabetes 
have clarified the relative importance 
of rare mutations having large 
effect sizes versus common SNPs having 
small effect sizes. Lango et al. (44) 
included a total of 410 individuals having 
causal mutations in the hepatocyte nuclear 
factor 1α (HNF1A) gene (maturity-onset 
diabetes of the young 3) in their study. They 
generated a single GRS representing the 
combined genetic susceptibility for type 2 
diabetes, based on 17 SNPs known to influence 
the risk of type 2 diabetes. Each 
additional type 2 diabetes risk allele was 
associated with a 0.35-year reduction in 
age at diagnosis (P = 0.005) in all individ- 
uals and with a 0.28-year reduction in un- 
related probands (P = 0.094). These 
results imply that the age of onset of 
monogenic diabetes caused by rare muta- 
tions having large effect sizes is not sub- 
stantially modified by common polygenic 
variants. This example emphasizes the 
potential significance of rare variants hav- 
ing large effect sizes over common var- 
iants having small effect sizes in the risk 
prediction of diabetes. 
Studies on non-European ancestry 
populations will help identify new 
variants relevant to type 2 diabetes 
prediction. Most of the GWA studies 
have been performed in European ances- 
try populations, and therefore current 
type 2 diabetes genetic risk models are 
not likely to be applicable to all popula- 
tions. Genetic variation is greatest in re- 
cent African ancestry populations (45), 
but there are no large GWAS where risk 
variants for type 2 diabetes in African 
populations have been investigated in de- 
tail. This information could greatly facili- 
tate the identification of trait-defining 
variants as shown by a recent study in 
an African American type 2 diabetic 
case-control population (46). The inves- 
tigators resequenced the critical chromo- 
somal region for association and by 
haplotype analysis showed that rs7903146, 
originally found in Caucasian populations, 
was indeed a causal variant and sufficient to 
explain the haplotype association. The 
identification of causal variants, instead of 
their originally identified proxy SNPs, can 
potentially improve type 2 diabetes predic- 
tion models (47). Furthermore, the differ- 
ences in genetic architecture among the 
populations could help to identify variants 
that are relatively rare in the Europeans but 
are more common in other ethnic groups. 
Thus, for example, the KCNQ1 gene was 
first identified in Asians where the minor 
allele frequency of the associated variants 
ranged between 30 and 40%, which was 
much higher than in Europeans with a fre- 
quency 10% (48,49). 
Studies of structural variation and 
epigenetics may help identify new variants 
relevant to type 2 diabetes prediction. 
The contribution of structural variation, 
including copy number variants (CNVs) 
(insertions and deletions) and copy-neutral 
variants (inversions and translocations), 
to the risk of type 2 diabetes is poorly 
known. To date, robustly replicated findings 
of CNVs associated with type 2 
diabetes have not been reported. The 
reason for this is, in part, that most of 
the CNV analyses have been based on CNVs 
that are "tagged" by GWAS SNPs and thus 
cover only a small set of well-behaved 
genomic regions (in Hardy-Weinberg 
equilibrium), whereas the amount of "structural 
dark matter" remains relatively untouched 
by arrays or sequencing of those 
genomic regions that are not amenable to 
GWAS arrays (48). Next-generation se- 
quencing has a considerably better poten- 
tial than conventional sequencing to find 
structural variation, which could contrib- 
ute to the understanding of the genetics of 
type 2 diabetes. However, it is not very 
likely that CNVs play a major role in the 
genetics of type 2 diabetes, given the fact 
that CNVs have been estimated to affect 
up to 5% of the human genome (43). 

Epigenetics means heritable changes in 
gene function attributable to chemical 
modifications of DNA and its associated 
proteins, independent of the DNA se- 
quence. The most investigated epigenetic 
modifications are methylation of cytosine 
residues in DNA and histone modifications 
(50). Changes in DNA methylation have 
been shown to be linked with some var- 
iants increasing the risk of type 2 diabetes. 
Hypermethylation of the maternal allele of 
KCNQ1 results in monoallelic activity of 
the neighboring maternally expressed pro- 
tein-coding genes and is associated with the 
risk of type 2 diabetes (51). Similarly, maternally 
expressed KLF14 only increases the 
risk when carried on the maternal chromo- 
some and acts as a master trans regulator of 
adipose tissue expression (52). These ex- 
amples demonstrate the possibility that 
several other genes, yet to be discovered, 
can contribute to the risk of type 2 diabetes 
via epigenetic mechanisms. Combining the 
advantages of GWAS and epigenome anal- 
yses might pave the way to better under- 
standing of the pathogenesis of type 2 
diabetes and improve genetic risk models. 
Unfortunately, methods to estimate whole- 
genome methylation are still under devel- 
opment and catch only a minor fraction 
of all methylation sites. Technical improvements 
in the near future might make 
genome-wide methylation scans more ex- 
tensive and reliable. 

Large population-based studies and 
development of statistical methods will 
improve analyses of gene-gene and 
gene-environment interactions. Most 
previous studies on the genetics of type 
2 diabetes, especially before the era of 
GWAS, applied a single-locus analysis 
strategy and thus ignored interactions. 
Recent advances in genotyping have con- 
siderably improved the opportunity to 
investigate the genetic architecture of type 
2 diabetes and have made it possible to 
perform meta-analyses of several population-based 
studies often including 
>100,000 participants. Although these 
studies exhibit considerable heterogene- 
ity, which weakens their power, they 
have paved the way to studies of gene- 
gene and gene-environment interactions. 
Recent advances include Metabochip, a 
custom-made Illumina array (Illumina, 
San Diego, CA), including 217,000 
SNPs, and Illumina Human Exome Bead- 
Chip including >250,000 putative func- 
tional exonic variants that are especially 
suited for genetic studies of type 2 diabetes. 
These large populations allow meta-analyses 
based on identical genetic platforms, which 
minimize the heterogeneity of genotyping 
results. 

Gene-gene and gene-environment in- 
teraction analyses based on large popula- 
tions increase the power to detect novel 
variants and more accurately characterize 
the genetic effects. They also may help to 
elucidate the biological and biochemical 
pathways responsible for complex disea- 
ses, e.g., type 2 diabetes, and identify the 
environmental effects. Risk prediction 
models including significant interactions 
also improve disease risk prediction. Such 
analyses require sophisticated 
statistical methods. For example, exhaustive 
evaluation of all two-marker models in 
GWAS data is already challenging, given 
the fact that 5 × 10^11 possible models 
from a set of 1 million SNPs need to be 
calculated (53). The ultimate goal is to in- 
tegrate modern statistical methods with ge- 
netic data and biological knowledge, which 
will further improve the power to detect 
complex interactions (54). 
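
The size of the two-marker search space mentioned above follows directly from counting unordered SNP pairs. A brief check, using only the Python standard library and assuming one model per unordered pair of SNPs:

```python
import math

n_snps = 1_000_000

# Number of unordered SNP pairs: n * (n - 1) / 2
pairwise_models = math.comb(n_snps, 2)
print(pairwise_models)            # 499999500000
print(f"{pairwise_models:.1e}")   # 5.0e+11
```

Extending the same count to three-marker models (math.comb(n_snps, 3)) gives a number of the order of 10^17, which illustrates why exhaustive searches for higher-order interactions are usually restricted to preselected loci.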

Conclusions 

Genetic testing for the prediction of type 2 
diabetes in high-risk individuals is currently 
of little value in clinical practice. 

The limitations of genetic risk models 
are small effect size of genetic loci, low 
discriminative ability of the genetic test, 
small added value of genetic information 
compared with the clinical risk factors, 
questionable clinical relevance of some 
genetic variants in disease prediction, and 
the lack of appropriate models for studies 
of gene-gene and gene-environment in- 
teractions in the risk prediction. 
For improvement of the genetic risk 
models in the future, the definition of type 
2 diabetes and classification of subtypes of 
diabetes should be more precise, new 
sequencing techniques should be applied 
to identify low-frequency and rare variants 
having a large effect size, non-European 
ancestry populations should be investi- 
gated to identify new variants relevant to 
type 2 diabetes prediction, studies of struc- 
tural variation and epigenetics should be 
performed to identify new variants relevant 
to type 2 diabetes prediction, and modern 
statistical methods should be developed 
and applied in studies of gene-gene 
and gene-environment interaction in large 
populations. 